Skip to content

Conversation

@Zhou-lemon
Copy link
Contributor

@Zhou-lemon Zhou-lemon commented Dec 30, 2025

What problem does this PR solve?

Issue Number: #58199

Problem Summary: This PR implements the rewrite_manifests procedure for Iceberg tables in Apache Doris. The feature allows users to optimize Iceberg table metadata by rewriting manifest files to improve query performance and reduce metadata overhead. This addresses the need for manifest file optimization in large Iceberg tables where numerous small manifest files can impact query planning performance.

ALTER TABLE catalog.test_db.my_table EXECUTE rewrite_manifests();

+---------------------------+----------------------------+
| rewritten_manifests_count | total_data_manifests_count |
+---------------------------+----------------------------+
|                         3 |                          3 |
+---------------------------+----------------------------+

Release note

Feature Implement rewrite_manifests procedure for Iceberg tables

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Copy link
Contributor

Thearas commented Dec 30, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Zhou-lemon
Copy link
Contributor Author

run buildall

@hello-stephen
Copy link
Contributor

FE UT Coverage Report

Increment line coverage 0.00% (0/94) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Copy link
Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/94) 🎉
Increment coverage report
Complete coverage report

@Zhou-lemon
Copy link
Contributor Author

run performance

@doris-robot
Copy link

TPC-H: Total hot run time: 35722 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit dd2070513296588d1b88b9fd7d0356f9799a6271, data reload: false

------ Round 1 ----------------------------------
q1	17636	4256	4037	4037
q2	2042	367	235	235
q3	10140	1296	736	736
q4	10227	899	328	328
q5	7862	2124	1925	1925
q6	223	170	142	142
q7	968	804	656	656
q8	9306	1452	1158	1158
q9	7014	5136	5198	5136
q10	6847	1812	1405	1405
q11	505	320	303	303
q12	737	752	575	575
q13	17788	3796	3105	3105
q14	293	307	297	297
q15	583	504	510	504
q16	699	693	627	627
q17	716	743	645	645
q18	8097	7733	8132	7733
q19	1223	1042	666	666
q20	417	405	268	268
q21	4505	4199	4234	4199
q22	1175	1074	1042	1042
Total cold run time: 109003 ms
Total hot run time: 35722 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4327	4227	4373	4227
q2	316	426	313	313
q3	2217	2984	2469	2469
q4	1398	1896	1435	1435
q5	4434	4433	4188	4188
q6	225	170	133	133
q7	1988	1909	1766	1766
q8	2627	2455	2375	2375
q9	6966	7220	7064	7064
q10	2455	2681	2147	2147
q11	532	455	454	454
q12	662	694	586	586
q13	3365	3814	3046	3046
q14	278	277	267	267
q15	534	484	488	484
q16	612	659	609	609
q17	1081	1278	1360	1278
q18	7327	7217	7352	7217
q19	878	887	880	880
q20	1872	1962	1787	1787
q21	4571	4321	4147	4147
q22	1092	1047	994	994
Total cold run time: 49757 ms
Total hot run time: 47866 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 175094 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit dd2070513296588d1b88b9fd7d0356f9799a6271, data reload: false

query5	4385	595	448	448
query6	358	255	230	230
query7	4228	467	277	277
query8	348	252	242	242
query9	8757	2650	2648	2648
query10	476	366	326	326
query11	15183	15058	14977	14977
query12	176	115	113	113
query13	1258	500	383	383
query14	6152	3017	2761	2761
query14_1	2674	2630	2645	2630
query15	208	193	176	176
query16	1024	520	478	478
query17	1117	720	604	604
query18	2495	441	350	350
query19	240	229	198	198
query20	132	118	116	116
query21	216	138	119	119
query22	3863	4081	3851	3851
query23	16114	15477	15421	15421
query23_1	15350	15505	15494	15494
query24	7448	1598	1214	1214
query24_1	1223	1226	1243	1226
query25	579	482	441	441
query26	1257	272	164	164
query27	2756	458	301	301
query28	4513	2203	2193	2193
query29	820	572	477	477
query30	322	245	216	216
query31	798	640	564	564
query32	79	71	67	67
query33	555	359	303	303
query34	895	892	547	547
query35	740	838	688	688
query36	874	830	816	816
query37	131	95	81	81
query38	2699	2738	2668	2668
query39	773	754	728	728
query39_1	719	707	733	707
query40	211	129	112	112
query41	64	62	66	62
query42	107	101	104	101
query43	479	443	445	443
query44	1361	764	759	759
query45	184	183	174	174
query46	881	963	626	626
query47	1396	1454	1366	1366
query48	316	325	263	263
query49	589	413	319	319
query50	655	290	205	205
query51	3737	3732	3721	3721
query52	98	108	98	98
query53	298	325	280	280
query54	290	264	248	248
query55	88	74	72	72
query56	285	288	286	286
query57	998	1038	934	934
query58	270	289	252	252
query59	1991	2121	2108	2108
query60	312	312	292	292
query61	167	154	159	154
query62	396	383	308	308
query63	295	264	273	264
query64	4952	1308	1029	1029
query65	3813	3751	3716	3716
query66	1434	434	318	318
query67	14874	14896	15309	14896
query68	8172	998	722	722
query69	495	347	313	313
query70	1067	916	872	872
query71	375	298	276	276
query72	5919	4770	4729	4729
query73	703	615	312	312
query74	8900	8804	8602	8602
query75	2917	2863	2517	2517
query76	3797	1074	689	689
query77	507	361	271	271
query78	9683	9806	9204	9204
query79	1246	957	634	634
query80	647	583	506	506
query81	506	260	230	230
query82	237	146	112	112
query83	267	262	248	248
query84	257	133	103	103
query85	900	512	463	463
query86	413	318	317	317
query87	2808	2863	2741	2741
query88	3289	2284	2289	2284
query89	403	354	344	344
query90	2171	165	155	155
query91	174	172	145	145
query92	82	67	63	63
query93	1207	916	565	565
query94	568	326	302	302
query95	587	384	311	311
query96	596	480	209	209
query97	2334	2406	2326	2326
query98	242	208	194	194
query99	557	602	511	511
Total cold run time: 253766 ms
Total hot run time: 175094 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 27.01 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit dd2070513296588d1b88b9fd7d0356f9799a6271, data reload: false

query1	0.05	0.05	0.05
query2	0.10	0.05	0.05
query3	0.25	0.08	0.08
query4	1.61	0.11	0.12
query5	0.26	0.25	0.25
query6	1.14	0.66	0.66
query7	0.03	0.02	0.02
query8	0.06	0.04	0.04
query9	0.57	0.51	0.52
query10	0.56	0.55	0.56
query11	0.15	0.10	0.11
query12	0.15	0.12	0.14
query13	0.60	0.59	0.61
query14	0.99	0.99	0.97
query15	0.79	0.78	0.80
query16	0.40	0.40	0.39
query17	1.02	1.06	1.05
query18	0.23	0.22	0.21
query19	1.83	1.87	1.87
query20	0.02	0.02	0.01
query21	15.46	0.29	0.14
query22	4.77	0.05	0.04
query23	15.92	0.29	0.10
query24	1.40	0.35	0.49
query25	0.11	0.10	0.06
query26	0.14	0.13	0.14
query27	0.06	0.08	0.05
query28	4.35	1.05	0.88
query29	12.60	3.96	3.15
query30	0.28	0.14	0.13
query31	2.84	0.61	0.37
query32	3.23	0.56	0.46
query33	3.02	3.03	3.11
query34	16.60	5.10	4.43
query35	4.45	4.49	4.43
query36	0.67	0.50	0.50
query37	0.10	0.06	0.07
query38	0.08	0.04	0.03
query39	0.04	0.03	0.03
query40	0.16	0.15	0.14
query41	0.08	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 97.25 s
Total hot run time: 27.01 s

@Zhou-lemon
Copy link
Contributor Author

@suxiaogang223 Could you please review this PR when you have time?

@suxiaogang223
Copy link
Contributor

@suxiaogang223 Could you please review this PR when you have time?

Cool👍. Sorry for the late reply, I will review it as soon as possible.

@suxiaogang223
Copy link
Contributor

I've reviewed the current implementation against the official Apache Iceberg Spark procedures documentation, and I'd like to suggest some changes to better align with the upstream specification.


1. Parameter Design

Spark Official Implementation

Argument Name Required? Type Description
table ✔️ string Name of the table to update
use_caching boolean Use Spark caching during operation (defaults to false). Enabling caching can increase memory footprint on executors.
spec_id int Spec id of the manifests to rewrite (defaults to current spec id)

Current Doris Implementation

Parameter Type Description
cluster-by-partition boolean Cluster manifests by partition fields
rewrite-all boolean Rewrite all manifests when true
min-manifest-size-bytes long Minimum manifest file size threshold
max-manifest-size-bytes long Maximum manifest file size threshold
scan-thread-pool-size int Thread pool size for parallel scanning

Issue & Recommendation

The current implementation is over-engineered compared to the official specification.

I believe we should simplify the implementation to only support the basic rewrite_manifests functionality. The extensive filtering parameters for manifest files are unnecessary because:

  1. Manifest files are typically very small - They contain metadata references to data files, not the actual data itself. Most manifest files range from a few KB to a few MB.

  2. The primary purpose of rewrite_manifests is to merge ALL small manifest files into larger ones for better scan planning. This is a metadata optimization operation that should process all manifests by default.

  3. Over-filtering defeats the purpose - The operation is meant to be comprehensive, not selective. If we only rewrite some manifests based on size thresholds, we miss the optimization benefits.

Proposed Changes

I recommend removing the following parameters:

  • cluster-by-partition (not in Spark spec - clustering is automatic based on partition spec)
  • rewrite-all (not in Spark spec - all manifests should be rewritten by default)
  • min-manifest-size-bytes (not in Spark spec - unnecessary complexity)
  • max-manifest-size-bytes (not in Spark spec - unnecessary complexity)
  • scan-thread-pool-size (not in Spark spec - engine-specific optimization)

Recommended Parameter Set

Parameter Required? Type Description
table ✔️ string Name of the table to update (inferred from ALTER TABLE)
spec_id int Spec id of the manifests to rewrite (defaults to current spec id)

Note: We could add spec_id as an optional parameter if needed for advanced use cases, but the core functionality should remain simple.


2. Return Value Schema

Spark Official Output

Output Name Type Description
rewritten_manifests_count int Number of manifests which were re-written by this command
added_manifests_count int Number of new manifest files which were written by this command

Current Doris Output

Column Name Type Description
rewritten_manifests_count int Number of data manifests rewritten by this command
total_data_manifests_count int Total number of data manifests BEFORE rewrite

Issue

The second output column is inconsistent with the official specification. The Spark version returns added_manifests_count (the number of new manifests created after the rewrite), not the total count before the operation.

Recommended Change

// Current implementation
return Lists.newArrayList(
    new Column("rewritten_manifests_count", Type.INT, false,
        "Number of data manifests rewritten by this command"),
    new Column("total_data_manifests_count", Type.INT, false,  // ❌ Incorrect
        "Total number of data manifests before rewrite")
);

// Recommended change - align with Spark spec
return Lists.newArrayList(
    new Column("rewritten_manifests_count", Type.INT, false,
        "Number of manifests which were re-written by this command"),
    new Column("added_manifests_count", Type.INT, false,  // ✅ Aligned with spec
        "Number of new manifest files which were written by this command")
);

Note: The added_manifests_count should reflect the actual number of new manifest files created by the rewrite operation, which is typically smaller than the original count.


3. Summary of Recommended Changes

Aspect Current Recommended Reason
Parameters 5 custom params 0-1 params (optional spec_id) Align with Spark spec; manifest files are small and don't need filtering
Return Column 2 total_data_manifests_count added_manifests_count Match official output schema
Functionality Complex with filtering logic Simple: merge all manifests Core purpose is optimization through consolidation
Implementation Custom predicate-based selection Direct Iceberg API usage Leverage upstream RewriteManifests action

4. Benefits of Alignment

Compatibility - Users familiar with Spark/Iceberg will have a consistent experience across engines
Simplicity - Easier to understand, test, and maintain
Performance - No overhead from unnecessary filtering logic
Documentation - Can reference official docs directly without translation
Future-proof - Automatically benefits from upstream improvements


5. Additional Considerations

5.1 Engine-Specific Optimizations

Parameters like scan-thread-pool-size are engine-specific and should be handled at the engine configuration level, not as procedure parameters. Doris can internally manage thread pools for optimal performance without exposing this to users.

5.2 Backward Compatibility

If there are concerns about removing existing parameters, we could:

  1. Deprecate them with warnings in a minor release
  2. Remove them in a future major release
  3. Document that they are ignored (no-op) with a deprecation warning

5.3 Implementation Reference

The recommended approach is to use Iceberg's RewriteManifests action directly:

RewriteManifests rm = table.rewriteManifests();
if (specId != null) {
    rm.specId(specId);
}
rm.commit();

This is simpler and more aligned with the official specification than the current predicate-based approach.


6. Example Usage (After Changes)

-- Basic usage - rewrite all manifests for current spec
CALL catalog_name.system.rewrite_manifests('db.table');

-- Or via ALTER TABLE (Doris syntax)
ALTER TABLE catalog_name.db.table EXECUTE rewrite_manifests();

-- Optional: rewrite manifests for a specific spec
ALTER TABLE catalog_name.db.table EXECUTE rewrite_manifests('spec_id' = '1');

@suxiaogang223
Copy link
Contributor

@Zhou-lemon I think our actions need to be consistent with Spark's official Iceberg implementation. Of course, some Spark-related parameters may not need to be implemented. You can see https://iceberg.apache.org/docs/latest/spark-procedures/#rewrite_manifests

@morningman morningman self-assigned this Jan 14, 2026
@Zhou-lemon
Copy link
Contributor Author

run buildall

@Zhou-lemon
Copy link
Contributor Author

run buildall

@Zhou-lemon
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TPC-H: Total hot run time: 31572 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a919bcc6d86c387cc0a15e7aff5cc3b2089b0108, data reload: false

------ Round 1 ----------------------------------
q1	17686	4270	4016	4016
q2	2065	384	259	259
q3	10111	1265	734	734
q4	10207	792	304	304
q5	7487	2089	1812	1812
q6	187	170	137	137
q7	1057	818	642	642
q8	9275	1383	1153	1153
q9	4853	4617	4556	4556
q10	6817	1798	1421	1421
q11	538	289	288	288
q12	759	739	606	606
q13	17802	3890	3112	3112
q14	286	293	279	279
q15	600	518	513	513
q16	684	690	626	626
q17	654	745	534	534
q18	6943	6446	6511	6446
q19	1561	975	622	622
q20	387	373	250	250
q21	3016	2421	2289	2289
q22	1056	1024	973	973
Total cold run time: 104031 ms
Total hot run time: 31572 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4158	4049	4064	4049
q2	330	399	344	344
q3	2125	2593	2234	2234
q4	1369	1770	1348	1348
q5	4118	3992	4150	3992
q6	213	169	129	129
q7	1924	1838	1659	1659
q8	2511	2395	2303	2303
q9	6707	6729	6624	6624
q10	2291	2561	2102	2102
q11	533	465	454	454
q12	655	708	607	607
q13	3344	3840	3115	3115
q14	307	285	265	265
q15	537	512	500	500
q16	634	661	605	605
q17	1090	1248	1278	1248
q18	7870	7276	7306	7276
q19	830	786	808	786
q20	1894	1996	1776	1776
q21	4464	4250	4119	4119
q22	1029	1030	972	972
Total cold run time: 48933 ms
Total hot run time: 46507 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 174172 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a919bcc6d86c387cc0a15e7aff5cc3b2089b0108, data reload: false

query5	5197	673	499	499
query6	346	246	221	221
query7	4224	477	261	261
query8	340	250	244	244
query9	8739	2884	2847	2847
query10	530	400	340	340
query11	15338	15253	14755	14755
query12	194	114	107	107
query13	1244	472	406	406
query14	6772	3046	2789	2789
query14_1	2713	2656	2658	2656
query15	201	188	175	175
query16	989	461	505	461
query17	1047	650	550	550
query18	2699	426	338	338
query19	214	218	191	191
query20	131	115	112	112
query21	225	141	115	115
query22	3891	3889	3976	3889
query23	16031	15511	15334	15334
query23_1	15347	15379	15374	15374
query24	7131	1542	1143	1143
query24_1	1154	1177	1164	1164
query25	563	439	395	395
query26	1234	264	144	144
query27	2769	450	275	275
query28	4580	2118	2103	2103
query29	752	521	431	431
query30	314	241	210	210
query31	816	624	577	577
query32	94	75	75	75
query33	540	342	305	305
query34	916	872	543	543
query35	714	751	681	681
query36	865	910	813	813
query37	136	102	84	84
query38	2795	2720	2706	2706
query39	776	751	736	736
query39_1	717	712	701	701
query40	220	134	118	118
query41	67	65	63	63
query42	106	99	103	99
query43	453	431	438	431
query44	1317	734	728	728
query45	195	187	179	179
query46	831	955	595	595
query47	1456	1462	1398	1398
query48	325	328	237	237
query49	629	444	364	364
query50	630	266	208	208
query51	3799	3770	3842	3770
query52	106	109	100	100
query53	288	323	279	279
query54	309	281	275	275
query55	86	83	76	76
query56	335	340	316	316
query57	1011	1031	964	964
query58	292	269	276	269
query59	2161	2025	2128	2025
query60	350	362	337	337
query61	190	196	188	188
query62	410	345	322	322
query63	301	271	278	271
query64	5021	1414	1105	1105
query65	3821	3758	3796	3758
query66	1401	434	335	335
query67	15603	15576	15382	15382
query68	2478	1101	734	734
query69	450	361	342	342
query70	1010	863	963	863
query71	334	329	294	294
query72	5463	3235	3325	3235
query73	609	720	310	310
query74	8744	8712	8567	8567
query75	2742	2856	2458	2458
query76	2283	1052	654	654
query77	356	372	317	317
query78	9789	9938	9189	9189
query79	1150	924	584	584
query80	1358	577	492	492
query81	555	260	231	231
query82	1286	143	110	110
query83	343	251	236	236
query84	260	121	102	102
query85	913	521	464	464
query86	419	296	292	292
query87	2901	2858	2794	2794
query88	3503	2589	2555	2555
query89	385	351	330	330
query90	1983	176	169	169
query91	175	161	143	143
query92	75	72	72	72
query93	1117	930	521	521
query94	661	330	306	306
query95	585	334	325	325
query96	642	496	226	226
query97	2366	2392	2358	2358
query98	217	204	195	195
query99	605	564	522	522
Total cold run time: 249833 ms
Total hot run time: 174172 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 27.24 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit a919bcc6d86c387cc0a15e7aff5cc3b2089b0108, data reload: false

query1	0.05	0.06	0.05
query2	0.11	0.05	0.04
query3	0.26	0.08	0.08
query4	1.62	0.12	0.11
query5	0.29	0.25	0.26
query6	1.15	0.66	0.65
query7	0.03	0.02	0.02
query8	0.05	0.03	0.04
query9	0.57	0.51	0.52
query10	0.55	0.55	0.55
query11	0.15	0.10	0.10
query12	0.14	0.11	0.12
query13	0.60	0.58	0.59
query14	0.95	0.94	0.95
query15	0.79	0.77	0.78
query16	0.41	0.42	0.40
query17	1.08	1.09	1.05
query18	0.23	0.21	0.22
query19	1.94	1.89	1.88
query20	0.02	0.01	0.05
query21	15.46	0.29	0.15
query22	5.21	0.05	0.05
query23	16.09	0.28	0.09
query24	1.10	0.79	0.57
query25	0.13	0.12	0.07
query26	0.13	0.13	0.13
query27	0.08	0.05	0.06
query28	4.62	1.06	0.88
query29	12.61	3.95	3.20
query30	0.28	0.13	0.12
query31	2.81	0.62	0.40
query32	3.24	0.56	0.45
query33	3.05	3.00	3.02
query34	16.39	5.05	4.45
query35	4.44	4.46	4.52
query36	0.66	0.50	0.50
query37	0.11	0.07	0.06
query38	0.07	0.04	0.04
query39	0.04	0.02	0.02
query40	0.17	0.14	0.12
query41	0.09	0.04	0.03
query42	0.04	0.03	0.03
query43	0.05	0.03	0.03
Total cold run time: 97.86 s
Total hot run time: 27.24 s

@suxiaogang223
Copy link
Contributor

Looks like perfect :)

@hello-stephen
Copy link
Contributor

FE UT Coverage Report

Increment line coverage 0.00% (0/67) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Copy link
Contributor

FE Regression Coverage Report

Increment line coverage 80.60% (54/67) 🎉
Increment coverage report
Complete coverage report

Copy link
Contributor

@suxiaogang223 suxiaogang223 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

Copy link
Contributor

@morningman morningman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jan 16, 2026
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@morningman morningman merged commit df9a8e6 into apache:master Jan 16, 2026
28 of 29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/4.1.x reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants